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Abstract 

We present a first-principles computer code package (ABACUS) that is based on density func¬ 
tional theory and numerical atomic basis sets. Theoretical foundations and numerical techniques 
used in the code are described, with focus on the accuracy and transferability of the hierarchial 
atomic basis sets as generated using a scheme proposed by Chen, Guo and He [J. Phys.:Condens. 
Matter 22 , 445501 (2010)]. Benchmark results are presented for a variety of systems include 
molecules, solids, surfaces, and defects. All results show that the ABACUS package with its as¬ 
sociated atomic basis sets is an efficient and reliable tool for simulating both small and large-scale 
materials. 

PACS numbers: 71.15.Ap, 71.15.Mb 
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I. INTRODUCTION 


The density functional theory-- (DFT) based first-principles methods are becoming in¬ 
creasingly important in the research fields of condensed matter physics, material sciences, 
chemistry, and biology. With the rapid development of supercomputers and the advances of 
numerical algorithms, nowadays it is possible to study the electronic, structural and dynam¬ 
ical properties of complicated physical systems containing thousands of atoms using DFT. 
In these cases, the efficiency of widely used plane wave (PW) basis is largely limited, because 
of its extended nature. Instead, local bases, such as atomic orbitals, are the better choices. 

Atomic orbitals have several advantages as basis sets for the ab initio electronic structure 
calculations in the Kohn-Sham scheme.—>2 First, the basis size of atomic orbitals is much 
smaller compared to other basis sets, such as PW or real-space mesh. Second, the atomic 
orbitals are strictly localized and therefore can be combined with either the so-called linear 
scaling algorithms^ for electronic calculations, or any other algorithm with a better scaling 
behavior than 0(N 3 ). For example, Lin et al. have recently developed a so-called Pole 
Expansion and Selected Inversion (PEXSI) technique,which takes advantage of the spar¬ 
sity of the Hamiltonian and the overlap matrices obtained with local orbitals, and allows to 
solve the Kohn-Sham equations with numerical effort that scales as N a (a < 2) for both 
insulating and metallic systems, with N being the number of atoms. 

While the analytical Gaussian-type orbitals have been well established for ab initio calcu¬ 
lations in the quantum chemistry community for decades, the numerically tabulated atomic 
orbitals are getting more and more popular in the computational physics community. Several 
first-principles codes based on the numerical atomic orbitals have been developed in recent 
years, e.g., SIESTA,- OpenMX,' FHI-aims,- to name just a few, which aim at large-scale 
DFT calculations by exploiting the compactness and locality of numerical atomic orbitals. 
However, the numerical atomic orbitals must be constructed very carefully to ensure both 
good accuracy and transferability. Furthermore, it would be highly desirable if the quality of 
the basis sets can be systematically improved in an unbiased way. Recently, some of us [Chen, 
Guo, and He (CGH)] proposed a new scheme^^ to construct systematically improvable op¬ 
timized atomic basis sets for DFT calculations. Based on the CGH procedure for basis set 
generations, we have developed a DFT package from scratch, named Atomic-orbital Based 
Ab-initio Computation at UStc (ABACUS) here in the Key Laboratory of Quantum Infor- 
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mation, University of Science and Technology of China (USTC). In the ABACUS package, 
besides the primary option of using numerical atomic orbitals as basis functions, PW ba¬ 
sis can also be employed as an alternative choice. This dual basis feature is very useful 
for accuracy and consistency checks in benchmark calculations. For both basis set choices, 
the package uses normal conserving pseudopotential in the Unified Pseudopotential Format 
(UPF) that has been used in Quantum ESPRESSO.- 11 The UPF pseudopotentials can be 
generated from the Opium package.- 2 Regarding the exchange-correlation functionals, we 
have implemented the local (spin) density approximation [L(S)DA], and the generalized 
gradient approximation (GGA) as constructed by Perdew, Burke, and Ernzerhof (PBE). 
In addition, semi-empirical van der Waals (vdW) corrected DFT scheme as proposed by 
Grimme (DFT-1)2)-^ has also been implemented. Other advanced functionals such as hy¬ 
brid functionals are currently under development and will be reported in a later work. At 
the level of LDA and GGA, the ABACUS package can do typical electronic structure 
calculations, structure relaxations, and molecular dynamics. 

In this paper, we first describe the main features of the ABACUS package, as well 
as the major techniques that are used to implement DFT algorithms with atomic basis 
sets. In a previous study,- the CGH orbitals have been demonstrated to be accurate and 
transferable for the group IV and group III-V semiconductors. Here, we extend the tested 
systems to a larger range of elements, including the alkali elements, 3d transition metals, 
group VI and group VII elements, with focus on the structural and electronic properties of 
molecules, solids, surfaces, and defects. The results demonstrate that ABACUS with the 
CGH orbitals are highly reliable for both finite and extended systems. In particular, the 
basis set at the level of double-C plus polarization function (DZP) is an excellent choice to 
compromise between accuracy and computational cost, and can be safely used in production 
calculations in most situations. 

The rest of paper is organized as follows. In Sec. [H] we introduce the basic algorithms 
and numerical techniques. In Sec. IIII1 we will demonstrate the performance of the ABA¬ 
CUS package, focus on the accuracy of the atomic orbitals generated using the CGH scheme, 
for a variety of benchmark systems. Finally, we summarize our work in Sec. IV. 
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II. METHODS 


In this section, we first briefly recapitulate the basic formulation of solving Kohn-Sham 
equations in atomic basis (Sec. Ill AD to set up the stage. This is followed by a description of 
the main techniques used in ABACUS. Topics to be covered include the generation of the 
CGH atomic orbitals ISec. Ill Bl) . the construction of Hamiltonian matrix and overlap matrix 
fSec. Ill Cl) , the solvers for Kohn-Sham equations fSec. IIIDl) . and finally the total energy and 
force calculations (Sec. Ill El) . 


A. The Kohn-Sham equation in atomic basis 

The central task in DFT calculations is to solve the Kohn-Sham equation, 

^KS^n(r) = e n T n (r), (1) 

where e n and T„(r) are the Kohn-Sham eigenvalues and eigenfunctions for state n. Hartree 
atomic unit (e = h = m e = 1) is used here and throughtout the paper. The Kohn-Sham 
Hamiltonian Hks can be written as, 

H ks = T + U ext (r) + V H [p{ r)] + V xc [p{y )\, (2) 

where T = — |V 2 , U ext (r), V H [p( r)], and V xc [p{ r)] are the kinetic energy operator, the ex¬ 
ternal potential, the Hartree potential, and the exchange-correlation potential, respectively. 
The Kohn-Sham Hamiltonian s thus depends on the electron density p(r), which can be 
determined from the occupied Kohn-Sham orbitals 

occ. 

p(r) = 2 51 l'M r )l 2 - ( 3 ) 

n= 1 

Here for simplicity we assume that the system is spin-degenerate, and hence the spin index 
is omitted. Extending the algorithm described here to the spin-polarized case is straightfor¬ 
ward and has been implemented in ABACUS. 

Norm-conserving pseudopotentials are used to describe the ion-electron interactions. The 
external potential U ext (r) in Eq. (|2]) contains the summation of the ion-electron potentials 
of all atoms plus, when they exist, applied external potentials. Therefore (in the absence of 
the applied external potential), 

( 4 ) 

R ai 
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where is a norm-conserving pseudopotential- 14 for the i-th atom of element type a, and 
T a i is the atomic coordinate in the cell R. The pseudopotential can split into a local part of 
the potential v „ and separable fully non-local potentials^ v^ L , 


yP s =v L + v NL 

^ rv ^ rv 1 '-'rv 


( 5 ) 


The applied external potentials, e.g ., electric fields, can be easily added to the local part of 
the potential, while the non-local pseudopotential can be written as, 

Imax l Tlmax 

P a ^ ^ 'y ^ y ' \Xalmn) {Xalmn | 5 (6) 

1=0 m=—l n= 1 

where | Xalmn) are non-local projectors, with l, m , n being the angular momentum, the 
magnetic momentum, and the multiplicity of projectors, respectively. In Eq. ([6]), l max and 
nmax are the maximal angular momentum and the maximal multiplicity of projectors for 
each angular momentum channel, respectively. 

The Kohn-Sham equation is usually solved within certain basis sets. The ABA¬ 
CUS package offers two choices of basis sets: the PW basis set and the atomic basis set. 
The advantage to do so is that the results obtained using atomic basis sets can be directly 
compared to those obtained from PW basis sets for small systems, and thus provides valu¬ 
able benchmarks for the former. This will be clearly seen in Sec. IIIII where the benchmark 
results for a variety of systems are presented. However, since the PW algorithm has been 
well developed and documented, here we only focus on the algorithms of the atomic-basis 
implementation. 

Without losing generality, we consider crystalline systems under periodic boundary con¬ 
ditions. The Kohn-Sham eigenfunctions in Eq. (JTJ) then become Bloch orbitals which, within 
atom-centered basis set, can be expanded as, 

'hnk(r) = —!= Y C «M,k e * k R 0/i( r - Tai ~ R), (7) 

' 1 R M 

where — r a j — R) are the atomic orbitals centering on the Tth atom of type a in the 
unit cell R. The orbital index /i is a compact one, /i = with l being the 

angular momentum, m the magnetic quantum number, and ( the number of atomic orbitals 
for a given l. Here n and k are the band index and Bloch wave vector, and c^k are the 
Kohn-Sham eigen-coefficients. Finally N is the number of unit cells in the Born-von-Karmen 


5 


supercell under the periodic boundary conditions. Using Eq. (J7J), the electron density within 
atom-centered basis sets can be computed as 


p(r) 


k(r) 

■LVk ! 

nk 

/nk< nik c ni/) ke _jk ' R 0*(r - T ai - R)0i/(r - r Pj ) 

^ R liv nk 

X] ~ T cd ~ R )<M r - T Pj ), 


( 8 ) 


R iiv 

where f n k is the Fermi occupation factor, and Nk is the number of k points in the Brillouin 
zone (BZ) sampling, which is typically equivalent to the number of real-space unit cells N 
in the Bloch summation. p fiu ( R) in Eq. [8] is the density matrix in real space, defined as 


/V( R ) = k c ^,k e lkR ( 9 ) 

fc nk 

Please note that in the last line of Eq. HJ we have assumed, without losing generality, the 
atomic orbitals to be real, he., 0* = 0 M . 

Given the expansion of the Kohn-Sham states in terms of atomic orbitals in Eq. (JTj), the 
Kohn-Sham equation Eq. ([TJ) becomes a generalized eigenvalue problem, 


H{ k)c k = E k S( k)c k , (10) 

where H( k), 5(k) and Ck are the Hamiltonian matrix, overlap matrix and eigenvectors at 
a given k point, respectively. is a diagonal matrix whose entries are the Kohn-Sham 
eigenenergies. To obtain the Hamiltonian matrix H( k), we first calculate 

H»v( R) = (0^r|T + U ext + V H + V XC \M, (11) 

where /i, v are atomic orbital indices within one unit cell, and 0^R=0 M (r — R — r Q j), 
0t/o=0t/(r — Tpj). The Hamiltonian matrix at a given k point can be obtained as, 

k) = ^e- ikR ^(R). (12) 

R 

Similarly, the overlap matrix at a given k point is obtained as, 

^(k) = ^e- ikR ^(R), (13) 

R 
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where, 


'S)ui'(-R') (0/iR l^vo) • 


(14) 


The construction of H( k) and S'(k), as well as solving Eq. (ITOl) take most of the computa¬ 
tional time. These two aspects will be discussed in more details in Secs III Cl and ITTDl 
In many cases, when the investigated unit cell is large enough, a single T-point in the 
BZ is enough to get converged results. In these cases, both the Hamiltonian and overlap 
matrices are real symmetric matrices. In the ABACUS package, we treat the T-point only 
calculations separately to improve the efficiency. 


B. Systematically improvable atomic basis sets 


Before going into the construction processes of H flv ( R) and S fiU ( R), here we introduce 
the CGH atomic orbitals that are used in ABACUS. The quality of atomic basis is essential 
to obtain accurate results. Unlike PW basis, with which the quality of the calculations can 
be systematically improved by simply increasing the PW energy cutoff, the way to generate 
high-quality atomic basis functions is much more complicated. In the last decades, con¬ 
siderable efforts have been devoted to developing high quality atomic orbitals. 7 dS~— ABA¬ 
CUS adopts a scheme proposed by CGH- to generate systematically improvable, optimized 
atomic basis sets. 

An atomic basis function can be written as a radial function multiplied by spherical 
harmonics (in practice we use solid spherical harmonic functions, which are real functions), 

(pimd r) = f K {r)Y lm {r ), (15) 


where the indices l,m, and ( have the usual meanings of angular momentum quantum 
number, magnetic quantum number, and the multiplicity of the orbitals for l. One usually 
needs more than one radial functions for each angular momentum to improve the quality 
and transferability of the atomic basis sets. 

In the CGH scheme, the radial function fi((r) is expanded in terms of a set of spherical 
Bessel functions (SBFs), with the coefficients of the SBFs yet to be determined, i.e., 


h ( r ) 


Y. q c KqiM r )i r <r c 
0 r > r c , 


(16) 
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where ji(qr c ) is the SBF with radius cutoff r c . The possible q values are chosen such that 
ji{q r c)— 0. A kinetic energy cutoff is chosen to determine the maximal value of q, and thus the 
number of SBFs. We set the SBFs to be strictly zero beyond the radius cutoff r c . In fact, 
they have been used directly as short-ranged basis set in first-principles calculations.—*^ 
Because (almost) any function within the radius cutoff r c can be represented as a linear 
combination of SBFs, this gives us a large number of degrees of freedom for optimizing the 
atomic basis set. 

To obtain optimized atomic basis, we vary the coefficients of the SBFs to minimize the 
spillage between the atomic basis set and a set of selected reference systems. The spillage S 
is a positive number defined as the difference between the Hilbert space spanned the atomic 
basis set and the wave functions calculated by PW basis, 

1 N n N k 

S = N Nv “ ^kl'Prck), 

iV ™ JVk n=l k=l 

where P is a projector spanned by the atomic orbitals, 

A = X>*k>S- 1 (k)<0,, k |. (18) 

Here, S -1 (k) is the inverse of overlap matrix S(k) between numerical atomic orbitals. The 
spillage has been proposed before to measure the quality of a set of atomic basis.——-— 
We then use simulated annealing method to determine the coefficients that minimize the 
spillage. Sometimes the numerical orbitals obtained from this procedure have unphysical 
oscillations, which may lower the transferability of the basis set. To eliminate these unphys¬ 
ical oscillations, we further minimize the kinetic energy of each atomic orbital while keeping 
the spillage almost unchanged. More details of this procedure can be found in Ref. 0. 

The reference systems used in the basis-generation procedure are very important to ensure 
the transferability of numerical atomic orbitals. The CGH scheme allows the users to choose 
freely the target systems to generate high-quality basis sets for different purposes. From 
our experience, homonulclear dimers are very good reference systems^*^ for generating 
transferable atomic orbitals for general purposes. To avoid any possible bias of the basis 
set towards certain geometrical structure, an average over dimers of several different bond 
lengths (compressed or elongated) is taken as the target in the optimization procedure. We 
provide scripts to generate the atomic bases using diatomic molecules as reference systems. 





The scripts set up PW calculations provided in the code for the dimers at various bond 
lengths. We remark that after the atomic basis sets are generated, for consistency, one must 
use the same pseudopotential and energy cutoff in later atomic orbitals based calculations 
as those used in the basis generation. 

The CGH scheme is very flexible and easy to implement. One can choose freely the 
angular momentum of the orbitals, and the multiplicity of the radial functions for each 
angular momentum. All atomic orbitals are generated from the same procedure and criteria. 
These orbitals form a sequence of hierarchial basis sets, which have a systematic convergence 
behavior towards the PW reference. Furthermore, without any assumptions of the shapes 
of radial functions /^(r), in principle we can get the fully optimized radial functions. As 
will be shown in Sec. m the atomic orbitals generated in this way indeed show excellent 
accuracy and transferability for various systems. 

C. Hamiltonian and overlap matrices construction 

As mentioned above, in order to construct H( k) and S'(k) at given k points using Eqs. (1I2| ) 
and (TT3|) . we need to first calculate H^ u ( R) and S^R). During the processes, we take the 
full advantage of the short-range nature of the atomic orbitals, he., only the matrix elements 
whose corresponding atomic orbitals have non-zero overlaps are evaluated. This is because 
each matrix element could be written as an integral in real-space grids, if this integral involves 
two spatially well-separated basis functions that are not overlapping, then the result of the 
integral should be zero. This feature leads to a sparse matrix and then O(N) scaling of the 
number of integrals, which is a significant advantage compared to PW based methods. 

Now we briefly discuss how each term in H^ W (TV) and S^R) is calculated. As is clear 
from Eq. (fTTlh the Hamiltonian matrix has several components, which are computed by 
two different techniques, namely the two-center integral technique and the grid integral 
technique, respectively. First, the kinetic energy matrix T fiu ( R) = (</> M R|T|0i,o)> the non-local 
pseudopotential matrix V^ L (R) = {(ft f _ l -R,\V NL \(j) u0 ), as well as the overlap matrix S^R) = 
(< pfinl^vo ) can be efficiently calculated by the two-center integral technique,— which has 
been described thoroughly in Ref. [(3. The two-center integrals can be split into two parts: 
a one-dimensional integral over radial functions and an angular integral involving spherical 
harmonic functions. The radial integrals are tabulated for a wide range of distances between 
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two orbitals once for all. The value for two orbitals within a reasonable distance can then 
be interpolated from the table. The angular integral involving spherical harmonic functions 
leads to quantities of so-called the Gaunt coefficients, which can also be easily calculated. 
Therefore, the two center integrals can be evaluated very efficiently.- Further details on the 
two-center integral technique are given in Appendix IA 11 

The matrix elements of the local potentials, V^“ C (R) = (0 m r| with 

V loc (r) = V L (v) + V H (v) + V xc (r) (19) 

are evaluated on a uniform grid in real space. Here U L (r) is the sum of all local pseu¬ 
dopotentials. We first evaluate the local potential V loc (r) on each grid point: the local 
pseudopotentials and Hartree potentials are calculated using techniques adapted from PW 
basis, he., they are first calculated in the reciprocal space, and then Fourier transformed 
to the real-space grid. The exchange-correlation potential can be directly evaluated on the 
real-space grid. Once we have V loc (r), the matrix elements V)(° C (R) are directly summed 
over the real-space grid. More details on the grid-based integral technique for the local po¬ 
tentials are given in Appendix IA 21 The grid integrals are one of the most time consuming 
parts in the algorithms based on atomic orbitals. However, the computation efforts of the 
grid integrals only scale linearly with the system size, and can be easily parallelized. 

D. Kohn-Sham equation solvers 

After the Hamiltonian and overlap matrices are constructed, the Kohn-Sham equations 
are solved separately at each k-point, which amounts to solving a generalized eigenvalue 
problem. This constitutes the major computational bottleneck for systems larger than a few 
hundreds of atoms. Standard diagonalization method scales as 0(N 3 ), where N is the matrix 
dimension. There are a few parallel matrix eigenvalue solvers available. ABACUS uses 
a package named High Performance Symmetric Eigenproblem Solvers (HPSEPS) developed 
by the Supercomputing Center of Chinese Academy of Science, to diagonalize the Kohn- 
Sham Hamiltonian.— HPSEPS provides parallel solvers for generalized eigenvalue problems 
concerning large dimensions of matrix. 

Recently, Lin et al.—~— developed the PEXSI technique, which provides an alternative 
way for solving the Kohn-Sham problem without using a diagonalization procedure. Com- 
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pared to linear scaling approach, PEXSI does not rely on the nearsightedness principle either 
to truncate density matrix elements. In a T-point calculation, the basic idea of PEXSI can be 
illustrated as follows. Denote by M the number of atomic orbitals, <h(r) = [^(r), • • • , (ft M (r)] 
the collection of all atomic orbitals in the real space, and y(r, r') the single particle density 
matrix in the real space. Then the PEXSI approach first expands y(r, r') using a P-terrn 
pole expansion as 


7(r, r') = $(r)r$*(r / ) 


Ri <h(r)Im 



OJi 


H — (zi + €f)S 


<r 


( 20 ) 


Here II. S, P are the Hamiltonian matrix, the overlap matrix and the single particle density 
matrix represented under the atomic orbital basis set <h, respectively. €f is the chemical po¬ 
tential or Fermi energy. The complex shifts {^} and weights {cuf} are determined through a 
simple semi-analytic formula, and takes negligible amount of time to compute. The number 
of terms of the pole expansion is proportional to log(/3AE), where /3 is the inverse of temper¬ 
ature and A E is the spectral radius, which can be approximated by the largest eigenvalue 
of the ( H , S ) matrix pencil. The logarithmic scaling makes the pole expansion a highly 
efficient approach to expand the Fermi operator. 

At Erst it may seem that the entire Green’s function-like object [H — (zi + 
needs to be computed. However, if we target the electron density p(r) = y(r, r), then only 
{[(H — (zi + £F)S)^ 1 ] fll/ 1 7 ^ 0} are actually needed. A selected inversion algorithm can 
be used to efficiently compute these selected elements of the Green’s function, and therefore 
the entire electron density. The computational cost of the PEXSI technique scales at most 
as 0(N 2 ). The actual computational cost depends on the dimensionality of the system: 
the cost for quasi-ID systems such as nanotubes is 0(N ) i.e. linear scaling; for quasi-two- 
dimensional systems such as graphene and surfaces (slabs) the cost is (9(A 1,5 ); for general 
three-dimensional bulk systems the cost is 0(N 2 ). This favorable scaling hinges on the sparse 
character of the Hamiltonian and overlap matrices, but not on any fundamental assumption 
about the localization properties of the single particle density matrix. This method is not 
only applicable to the efficient computation of electron density, but also to other physical 
quantities such as free energy, atomic forces, density of states and local density of states. 
All these quantities can be obtained without computing any eigenvalues or eigenvectors. For 
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instance, the atomic force for atom i in species a can be computed as 


F ni & -Tr 


r 


dH 

dr ni 


+ Tr 


; 8S 
dr ni 


( 21 ) 


where the hrst part is independent of PEXSI algorithm and will be discussed in the next 
subsection. The second term in Eq. [21] depends on the energy density matrix, which is 
written as 


Im £ wz 


cor 


1=1 


H — (zi + £f)S 


( 22 ) 


This matrix is given again by pole expansions with the same poles as those used for com¬ 
puting the charge density, with different weights {oaf}. For more detailed information we 


refer readers to Refs. 


29 


and 


30 


In order to use the PEXSI technique for multiple k-point 


calculations, we need to work with the Green’s function of a non-Hermitian Hamiltonian 
that is only structurally symmetric. The massively parallel selected inversion method for 
non-Hermitian but structurally symmetric matrices are currently under development, and 
will be integrated into ABACUS in the future to perform large scale electron structure 
calculations with multiple k-point sampling. 

Compared to existing techniques, the PEXSI method has some notable features: 1) The 
efficiency of the PEXSI technique does not depend on the existence of a finite Highest 
Occupied Molecular Orbital (HOMO)-Lowest Unoccupied Molecular Orbital (LUMO) gap, 
and can be accurately applied to general materials systems including small gapped systems 
and metallic systems. The method remains accurate at low temperatures. 2) The PEXSI 
method has a two-level parallelism structure and is by design highly scalable. The recently 
developed massively parallel PEXSI technique can make efficient use of 10, 000 ~ 100, 000 
processors on high performance machines. 3) As a Fermi operator expansion based method, 
PEXSI allows the use of a hybrid scheme that combines density of states estimation based 
on Sylvester’s law of inertia with Newton’s method to obtain the chemical potential. This 
is a highly efficient and robust approach with respect to the initial guess of the chemical 
potential, and is independent of the presence of gap states. 4) PEXSI can be controlled with 
a few input parameters, and can act nearly as a black-box substitution of the diagonalization 
procedure commonly used in electronic structure calculations. 

In order to benefit from the PEXSI method, the Hamiltonian and overlap matrices must 
be sparse, and this requirement is satisfied when atomic orbitals are used to discretize the 
Kohn-Sham Hamiltonian. The sequential version of PEXSI has been demonstrated before 
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with ABACUS ,— and the massively parallel version of PEXSI is recently integrated with 
SIESTA— for studying large scale systems with more than 10, 000 atoms with insulating 
and metallic characters on more than 10, 000 processors. The parallel PEXSI method is to 
be integrated with ABACUS . 

E. Total energy and force calculations 

Once the Kohn-Sham equation is solved, one can obtain the total energy of the system 
using the Harris functional,— 

E tot = E band _ J v Hxc (r)p(r)dr + E H + E xc + E 11 , (23) 

where E band is the Kohn-Sham band energy, which is the summation over occupied Kohn- 
Sham orbital energies. E H , E xc and E 11 are the Hartree energy, the exchange-correlation 
energy, and the Coulomb energy between ions respectively. The second term in the above 
equation is the so-called double-counting energy arising from the Hartree and exchange- 
correlation potential, which have been included in the band energy term. If the Kohn-Sham 
equation is solved by matrix diagonalization, then the band energy is the summation of the 
Kohn-Sham eigenvalues e„k of all occupied bands, i.e., 

2>.tc»k. (24) 

k nk 

Alternatively, if the PEXSI method is chosen to be the Kohn-Sham equation solver, the 
band energy is calculated as (only for T-point now) 

E band = Tr [ p #] ) (25) 

where p is the density matrix and H is the Hamiltonian matrix. The Hartree and exchange- 
correlation energies are calculated on a uniform real-space grid, and E n are calculated by 
the Ewald summation technique.— 

The forces acting on the ions are given by the derivative of the total energy with respect 
to the atomic coordinates. Analytical expressions for forces computed with atomic basis sets 
are more sophisticated than those with the PW basis sets. Besides the Feynman-Hellmann 
forces, Pulay forces^ due to the change of the atomic basis sets during a structural relaxation 
should also be considered. 
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Therefore, we rewrite the total energy as the sum of two parts: E tot = E KS + E n , where 
E ks is the electronic part of the total energy in Eq. [23l while E n is the energy due to the 
Coulomb interactions between ions. Then the total force experienced by the i-ion of type a 
is 


F • = 

-*■ rv 7 . 


dE tot 


dE KS dE n 


Ot • dr ' c)t ■ 

' Oil ^ 1 Oil ^ 1 CXI 

After some derivations (more details are given in Appendix IA 31) . we arrive at 


dE KS 

dr ni 




R f-LV 


dr n 


EE 

R fiv 


dp, 


C' / rvi 


(26) 


(27) 


with 


dH^( R 


dH 




dr n 


~ (M-k— \<f>vo) + (-o 

^ ‘ Oil V 1 Oil 


H\M + <0„ R |i/|^). 

OTo/i 


(28) 


Note that /i = {a, i, l, m, C} is the compact index for the atomic orbitals. The first term of 
the above equation is the Feynman-Hellmann force,— whereas the rest two terms yield the 
so-called Pulay forces.— Following Ref. (], one can prove that the term related to j g 


EE 

R iii/ 


dpfwCR) 

dr m 


H r = EE b - 

R fiv 


„,«V(R) 

g T . 

^ 1 rv 7. 


(29) 


where, 

E^(R) = Ue nk c; nM c nuM e- ,k - R , (30) 

is the element of “energy density matrix”. In the above equation, e n k is the band energy for 
band n at wave vector k. This term arises because the atomic orbitals are not orthogonal. 

Similar to the total energy evaluation, different force terms are also evaluated using 
different techniques to maximize the efficiency of force calculations. Specifically, due to the 
long range tail of local pseudopotential in real space, it is better to calculate it in reciprocal 
space using PW basis set, and this is how Feynman-Hellmann force associated with the 
local pseudopotentials is implemented. Another advantage of this implementation is that 
the derivative of the local pseudopotential can be easily done in reciprocal space. By taking 
advantage of the short-range character of the non-local pseudopotential operators and atomic 
orbitals, the force terms including the Feynman-Hellmann force arising from the non-local 
pseudopotential operator, the Pulay forces arising from the kinetic energy operator, as well 
as the non-orthogonal forces, are calculated using two-center integral techniques. Finally, 
the Pulay forces associated with the local potentials are evaluated by grid integrals. Here 
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the local pseudopotentials are first evaluated in reciprocal space and Fourier transformed to 
a real-space mesh. Technical details on the force calculations can be found in Appendix IA 31 
With the capability of calculating forces efficiently, a structural relaxation can be done 
by searching the local minimum in potential energy surface. Two algorithms for struc¬ 
tural relaxations are implemented, namely the Broyden-Fletcher-Goldfarb-Shanno (BFGS) 
method^ and the conjugate gradient (CG) method.— 
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III. RESULTS 


In this section, benchmark results obtained from ABACUS are presented for a variety 
of systems, including molecules, crystalline solids, surfaces, and defects. The tested atomic 
species cover both main-group elements and transition metal elements. In particular, the 
convergence of calculated physical properties are tested with respect to the size of CGH or¬ 
bitals. The tested basis sets form a hierarchy by spanning from single-^ (SZ), double-^ (DZ), 
double-^ plus polarization functions (DZP), to triple-^ plus double polarization functions 
(TZDP), and quadrupole-C plus triple polarization functions (QZTP). In the following tests, 
we refer to these basis sets as atomic basis sets, or equivalently, as linear combination of 
atomic orbitals (LCAO). The references for the tested properties are chosen to be those cal¬ 
culated by converged PW basis set with the same pseudopotentials. Available experimental 
results are also included for comparisons. 

The naming of basis sets depends on the valence electrons of each element. For example, 
SZ refers to a single s orbital for elements that have only s valence electrons, such as alkali 
metal elements. For elements that have p valence electrons, SZ refers to a single s orbital 
plus three p orbitals, such as first- and second-row non-metal elements. Furthermore, for 
transition metal elements, SZ refers to one s orbital, three p orbitals plus five d orbitals. 
Here the number of orbitals on each angular momentum channel is (2/ + 1), where l is the 
angular momentum quantum number. The polarization functions refer to orbitals that have 
higher l than the maximal one used in a SZ basis set. Specifically, in a SZ basis set that 
contains only one s orbital, the p orbitals are referred to the polarization functions; in a 
SZ basis set that has both s and p orbitals, the d orbitals are indicated as the polarization 
functions. Finally, for transition metals which use all s, p, d orbitals in a SZ basis set, the 
/ orbitals are the polarization functions. 

Two more parameters are needed to define a CGH atomic orbital. First, because CGH 
atomic orbitals are generated by an optimization procedure that is based on the results from 
PW calculations of target systems (typically diatomic molecules here), thus the generated 
set of atomic orbitals depend on a specific energy cutoff (E cut ) used in PW calculations. 
Second, all CGH atomic orbitals are enforced to be strictly localized within a radius R cu t, 
beyond which the atomic orbitals are set to be exactly zero. Table [T] lists both parameters 
for 24 elements that will be used in the followings to test physical properties of systems. 
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TABLE I: The energy cutoff E CVlt (in Ry) and radius cutoff R cu t (in Bohr) parameters of the LCAO 
basis functions for 24 different elements used in this paper. 


Element E cut (Ry) R cut (Bohr) Element E cut (Ry) R cut (Bohr) 


H 

50 

6 

Cl 

50 

8 

Li 

30 

12 

Ti 

100 

10 

C 

50 

8 

Fe 

100 

10 

N 

50 

8 

Cu 

100 

8 

0 

50 

8 

Ga 

50 

9 

F 

50 

8 

Ge 

50 

9 

Na 

20 

12 

As 

50 

9 

Mg 

20 

12 

Br 

50 

9 

A1 

50 

9 

Br 

50 

9 

Si 

50 

8 

In 

50 

9 

P 

50 

9 

Sb 

50 

9 

S 

50 

8 

I 

50 

9 


Compared to a PW basis set which can be systematically increased to reach arbitrary 
accuracy in a calculation, the accuracy of the existing LCAO basis sets are known to be 
difficult to improve systematically. However, our construction strategy for numerical atomic 
orbitals described in Sec. I11BI can in principle guarantees a systematic convergence towards 
the PW accuracy for the target system. We note that, for general systems, the convergence 
behavior of an atomic basis set should be checked a posteriori. This is carried out separately 
for molecules in Sec. 1111 Bl and for solids in Sec. 1111 Cl Among all these tests, first of all, let us 
look into one numerical issue that is very common in atomic-orbital based calculations - the 
so-called eggbox effect^ - which occurs when the integrals of matrix elements are evaluated 
on a finite, uniformly spaced real-space grid. 

A. Eggbox effect 

The eggbox effect refers to the artificial rippling of the ground-state total energy as 
a function of the atomic displacements relative to an uniform real-space grid points.— 1 ^ 
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Specifically, it arises from the numerical errors in evaluating the integrals of a Hamiltonian 
operator with respect to the local orbitals on a finite uniform real-space grid. This effect is 
completely artificial but considerably complicates the calculations of forces acting on atoms 
and phonon dispersions. Naturally, the denser the real-space grid is - corresponding to a 
higher energy cutoff in the reciprocal space, the smaller in magnitude that the rippling of the 
ground-state total energy will be. Conversely, if an atomic orbital in the real space is designed 
in a cautious way that the high-energy components of its Fourier transform are suppressed, 
then a less denser real-space grid is needed. Based on this principle, Anglada and Soler 
proposed an efficient filtering procedure 3 - to effectively suppress the high-energy components 
of their local orbitals, without sacrificing the locality of these basis functions. This procedure 
has been shown to work well in the SIESTA package.— In the ABACUS package, our basis 
functions are automatically confined in the reciprocal space below a certain energy cutoff 
during the construction processes. 

Consequently, we show the simulated results for diatomic molecules SB, O 2 , and Mn 2 at 
their equilibrium distances in Fig.[D Unless otherwise stated, the molecular calculations are 
all done under the periodic boundary conditions. A cubic box with a side length of 20 Bohr 
is chosen for all three molecules. The atomic basis set is chosen to be DZP. As illustrated 
in Fig. [U the eggbox effects are exceedingly small for these molecules. In particular, for Si 2 
and 0 2 , the oscillations of the total energy are within 1 meV, while the force oscillations are 
within 1 meV/A. The oscillation of force for Mn 2 is most pronounced, but is still within 10 
meV/A. This accuracy is sufficient for most practical purposes. On top of these oscillation 
patterns, the additional wiggling patterns of the energy and force curves for 0 2 and Mn 2 
have not been well understood, but these only occur at a smaller energy scale and have not 
caused any problems so far. As will be shown in Sec lIII Dl accurate structural relaxations 
can be carried out without further corrections for the eggbox effect. 

B. Molecules 

Having the eggbox effect under control, we now look into the convergence quality of atomic 
basis set for small molecules. Similar to the eggbox test cases, a cubic cell with a side length 
20 Bohr is chosen here to avoid the artificial interactions between a molecule and its images. 
Taking N 2 as an example, we plot its ground-state energy versus its bond distance in Fig. [21 
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Atomic displacement 


FIG. 1: (Color online) The eggbox effect for Si 2 , O 2 , and M 112 molecules. The unit in the x axis is 
the spacing between neighboring real-space grid points, which amounts to 0.21 Bohr for an energy 
cutoff of 50 Ry. 

The hierarchical basis sets using in this calculation range from SZ to QZTP. The reference is 
obtained from PW calculations. As illustrated in In Fig. [2l as the size of the atomic basis set 
increases, the total energy of the N 2 molecule converges systematically towards the PW limit. 
From these curves, one can deduce the equilibrium bond length, atomization energy, and 
vibrational frequency of the N 2 molecule. These quantities obtained with atomic basis sets 
for a selected molecular set can be used to validate the convergence quality of the localized 
basis sets, when compared to the corresponding results obtained from PW calculations. 

In the followings, we present the benchmark results for bond lengths, atomization ener¬ 
gies, and vibrational frequencies for 11 chemically bonded diatomic molecules. Followed by 
the interaction energies of the S22 molecular test set,— obtained by the DFT-D method^ 
as implemented in ABACUS. The first test is used to validate our methods for chemically 
bonded dimers while the second one is for weakly bonded systems. 


1. Bond lengths 


The bond length of a molecule is an important quantity. In Table HU the calculated bond 
lengths of 11 diatomic molecules are presented for atomic basis sets range from SZ to QZTP. 


The PW results are also shown, and the experimental data are taken from Refs. 
Both LDA^— and PBE 44 exchange-correlation functionals are used. 


41 


and 


42 
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FIG. 2: (Color online) Total energy of the N 2 molecule as a function of the bond length for a 
sequence of increasing LCAO basis sets. The PW results are also shown for comparison. 

As shown in Tabic [TO all the calculated equilibrium bond lengths systematically approach 
the corresponding PW results with increased atomic orbital basis sets. Specifically, the mean 
absolute error (MAE) for the DZP basis set is 0.018 A for both LDA and PBE calculations. 
This accuracy is sufficiently good for most practical purposes. When going beyond DZP to 
TZDP and QZTP basis sets, the MAEs become even smaller; at the QZTP level, the MAE 
is only 0.004 A for LDA and 0.003 A for PBE. 

For alkali-metal elements, the construction of high-quality localized atomic orbitals^ 1 ^ is 
highly challenging because these orbitals tend to be very diffusive and have a longer tail 
than the atomic orbitals of other elements. However, by using CGH orbitals, we found that 
a rather satisfactory description of the molecular bonding involving alkali metal atoms can 
be achieved, as can be seen from the examples of Na 2 and LiH in Table [TT1 if a large cutoff 
radius, i.e., between 10 Bohr and 12 Bohr, is used. The same observation holds for alkali 
metal elements in bulk materials, as will be shown in Sec. IIII Cl 

The experimental values in Table [XT] are only shown for comparison purpose, and not for 
benchmark purpose. In all tests of bond lengths, as expected, the converged LDA bond 
lengths are systematically smaller than the corresponding experimental values, while the 
converged PBE values show the opposite behavior. 
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TABLE II: Bond lengths (in A) of diatomic molecules obtained with various sets of atomic orbitals, 
in comparison with PW results and experimental data (EXP). Both LDA and PBE results are 
shown. The MAEs of atomic basis sets are obtained with reference to the PW results. 


Molecules 

SZ 

DZ 

DZP 

TZDP 

QZTP 

PW 

EXP 




LDA 




n 2 

1.227 

1.121 

1.107 

1.098 

1.096 

1.095 

1.098 

0 2 

1.086 

1.132 

1.192 

1.195 

1.196 

1.198 

1.208 

S 2 

1.683 

1.724 

1.852 

1.869 

1.870 

1.871 

1.889 

F 2 

1.304 

1.331 

1.398 

1.402 

1.402 

1.405 

1.412 

Cl 2 

1.848 

1.877 

1.932 

1.949 

1.951 

1.952 

1.988 

Br 2 

2.035 

2.184 

2.211 

2.226 

2.233 

2.240 

2.281 

h 

2.479 

2.563 

2.608 

2.623 

2.634 

2.641 

2.665 

Li 2 

2.503 

2.570 

2.627 

2.639 

2.639 

2.642 

2.673 

Na 2 

2.901 

2.972 

3.028 

3.038 

3.041 

3.053 

3.079 

CO 

1.271 

1.157 

1.136 

1.125 

1.125 

1.123 

1.128 

LiH 

1.659 

1.688 

1.621 

1.597 

1.597 

1.599 

1.595 

MAE 

0.137 

0.072 

0.018 

0.006 

0.004 

/ 

/ 




PBE 




n 2 

1.253 

1.167 

1.109 

1.103 

1.103 

1.101 

1.098 

0 2 

1.289 

1.251 

1.225 

1.218 

1.214 

1.211 

1.208 

S 2 

1.981 

1.929 

1.903 

1.892 

1.891 

1.891 

1.889 

F 2 

1.319 

1.352 

1.402 

1.413 

1.416 

1.418 

1.412 

Cl 2 

2.195 

2.087 

2.019 

2.006 

2.003 

2.001 

1.988 

Br 2 

2.457 

2.396 

2.330 

2.313 

2.304 

2.292 

2.281 

h 

2.837 

2.782 

2.718 

2.693 

2.681 

2.674 

2.665 

Li 2 

2.770 

2.710 

2.699 

2.690 

2.690 

2.687 

2.673 

Na 2 

3.265 

3.187 

3.103 

3.094 

3.094 

3.092 

3.079 

CO 

1.140 

1.133 

1.130 

1.129 

1.129 

1.129 

1.128 

LiH 

1.748 

1.668 

1.612 

1.608 

1.607 

1.605 

1.595 

MAE 

0.133 

0.065 

0.018 

0.007 

0.003 

/ 

/ 
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TABLE III: Atomization energies (in eV) of molecules obtained with various sets of atomic orbitals, 
in comparison with PW and experimental results (EXP). The MAEs of atomic basis sets are 
obtained with reference to the PW basis set. 


Molecule 

SZ 

DZ 

DZP 

TZDP 

QZTP 

PW 

EXP 




LDA 




n 2 

6.882 

9.566 

11.007 

11.162 

11.183 

11.193 

9.759 

0 2 

5.636 

6.495 

7.437 

7.491 

7.506 

7.542 

5.117 

S 2 

3.990 

4.540 

4.850 

4.944 

4.985 

5.006 

4.370 

F 2 

0.746 

0.995 

1.834 

1.876 

1.886 

1.917 

1.601 

Cl 2 

1.430 

1.648 

2.871 

2.912 

2.912 

2.943 

2.480 

Br 2 

1.698 

2.086 

2.342 

2.381 

2.393 

2.412 

1.971 

h 

1.179 

1.457 

1.931 

1.943 

1.965 

1.984 

1.542 

Li 2 

1.323 

1.135 

1.111 

1.106 

1.104 

1.083 

1.037 

Na 2 

1.134 

0.998 

0.964 

0.933 

0.933 

0.902 

0.735 

CO 

10.252 

11.099 

12.722 

12.741 

12.754 

12.758 

11.108 

LiH 

1.585 

2.218 

2.633 

2.684 

2.684 

2.664 

2.415 

MAE 

1.408 

0.769 

0.080 

0.034 

0.022 

/ 

/ 




PBE 




n 2 

6.389 

8.727 

10.375 

10.592 

10.592 

10.623 

9.759 

o 2 

4.712 

5.183 

6.043 

6.133 

6.145 

6.190 

5.117 

S 2 

4.032 

4.467 

4.602 

4.664 

4.747 

4.788 

4.370 

F 2 

0.852 

1.239 

1.786 

1.815 

1.820 

1.834 

1.601 

Cl 2 

1.552 

2.196 

2.682 

2.714 

2.721 

2.734 

2.480 

Br 2 

1.329 

1.783 

2.024 

2.065 

2.086 

2.105 

1.971 

I 2 

1.085 

1.293 

1.642 

1.685 

1.704 

1.728 

1.542 

Li 2 

1.306 

1.124 

1.096 

1.088 

1.087 

1.062 

1.037 

Na 2 

1.106 

0.943 

0.902 

0.881 

0.881 

0.850 

0.735 

CO 

10.622 

10.969 

11.518 

11.569 

11.576 

11.609 

11.108 

LiH 

2.975 

2.684 

2.643 

2.622 

2.612 

2.601 

2.415 

MAE 

1.083 

0.545 

0.097 

0.041 

0.026 

/ 

/ 
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2. Atomization energies 


The atomization energy refers to the energy cost to split a molecule into individual atoms. 
It is an important property in thermochemistry. The benchmark for the atomization energy 
are done using the same diatomic molecular set and hierarchal atomic orbitals as used for 
bond-length tests. The results are presented in Table IIIII For open-shell atoms, the spin- 
polarized, symmetry-broken solution usually has the lowest ground-state energy, and is thus 
taken here as the reference of calculated atomization energy. As can be seen from Table IIIII 
for all molecules, the atomization energies obtained with the atomic basis sets converge 
systematically towards the PW limit. 

Quantitatively, however, the accuracy of atomization energies obtained with atomic basis 
sets is not so spectacular as the one of geometrical properties. In particular, the atomization 
energies from SZ basis are indeed too small; the corresponding MAE is over 1 eV. The MAE 
is improved by about a factor of 2 when the basis set goes from SZ to DZ, but is still far 
from satisfactory. A dramatic improvement is achieved at the DZP level where a MAE 
around 0.1 eV is an acceptable accuracy for most practical purposes. Nevertheless, to reach 
the “chemical accuracy” (1 kcal/mol ~ 0.043 eV), one has to go further in the hierarchy of 
atomic basis sets. The “chemical accuracy” is almost reached with the TZDP basis (0.034 
eV for LDA and 0.041 eV for PBE) and well reached with QZTP (0.022 eV for LDA and 
0.026 eV for PBE). Here, the so-called basis-set error is much smaller than the errors of 
the energy functionals. For this diatomic molecular test set listed in Table IIIII on average 
PBE overbinds by 0.36 eV and LDA overbinds by 0.75 eV by comparing the converged PW 
results to the experimental ones. 

3. Vibrational frequency 

For a given molecule, the equilibrium bond length and the atomization energy reflect 
the position and depth of the minimum in its potential energy surface, respectively, and 
the vibrational frequency probes the curvature of the potential energy surface around the 
equilibrium geometry. The vibrational frequency can be measured directly by experiment 
and is a powerful probe of structural and bonding characteristics of a molecule. 

In Table IIVI we present the calculated vibrational frequencies for the same molecular 
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test set and basis sets used before. Compared to the PW reference results, the MAE is 
reduced dramatically when the basis set goes from SZ to DZP. Specifically, at the DZ level 
the MAEs are 65 cm -1 for LDA and 47 cm” 1 for PBE, and these numbers are further 
reduced to 6 cm” 1 when the basis set increased from the DZ to DZP. The results indicate 
the importance of polarization functions in an atomic basis set to accurately describe the 
curvature of the potential energy surface. Nevertheless, unlike the tests for bond lengths 
and atomization energies of molecules, increasing the size of basis set does not guarantee a 
better description of these vibrational frequencies. We thus conclude that when CGH basis 
set goes beyond DZ, it still introduces around 6 cm” 1 error for the vibrational frequencies of 
molecules comparing to PW calculation results. Overall speaking, this accuracy is already 
excellent for most practical purposes. 

4- Weak interaction energy 

In the previous tests, we have demonstrated the convergence behavior of our basis sets for 
chemically bonded diatomic molecules. In a sense, the good performance of these atomic ba¬ 
sis sets for the tested molecules is not surprising, since the homo-nuclear diatomic molecules 
are the target systems when generating atomic orbitals. Now we turn to the study of bigger 
and more weakly bonded molecules - the S22 test set.- 0 This test set contains 22 weakly 
interacting molecular complexes that include hydrogen bonding, dispersion interaction, and 
mixed bonding types. Since its inception, the S22 test set has been widely used to benchmark 
computational methods that deal with van der Waals (vdW) interactions. 

Each member in the S22 test set is a molecular dimer that contains two monomers 
interacting with each other. The interaction energy is defined as the difference between the 
total energy of the dimer and the sum of the total energies of the two individual monomers 
in their fully relaxed geometries. In this work, the interaction energies of the S22 molecules 
are calculated using the DFT-D2 method of Grinnne,— specifically PBE-D2, as recently 
implemented in ABACUS. 

The calculations for S22 molecules are still done with the supercell approach with cubic 
boxes that are sufficiently large (up to 50 Bohr side length) to avoid artificial interactions 
between the molecule and it the periodic images. Fig. [3] presents the interaction energy 
differences of the S22 molecules calculated from two methods. The first is PBE-D2 with 
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TABLE IV: Vibrational frequencies (in cm -1 ) of molecules obtained with various sets of atomic 
orbitals, in comparison with PW and experimental results (EXP). Results from both LDA and 
PBE are shown. The MAEs are calculated based on the results from atomic basis set and PW 
basis set. 


Molecules 

SZ 

DZ 

DZP TZDP QZTP 

PW 

EXP 




LDA 




n 2 

1857 2261 2378 

2394 

2391 

2370 2359 

0 2 

1246 1329 1558 

1581 

1582 

1567 1580 

S 2 

517 

675 

717 

719 

719 

721 

726 

F 2 

813 

885 

920 

915 

917 

924 

917 

Cl 2 

498 

524 

561 

560 

562 

567 

560 

B2 2 

471 

382 

341 

341 

339 

332 

325 

h 

345 

234 

199 

199 

201 

206 

215 

Li 2 

306 

331 

342 

340 

340 

339 

351 

Na 2 

129 

141 

151 

148 

150 

147 

159 

CO 

1631 

1998 2130 

2130 

2130 

2130 2170 

LiH 

1107 1286 1327 

1330 

1334 

1341 

1406 

MAE 

207 

69 

6 

8 

7 

/ 

/ 




PBE 




n 2 

1938 2397 2374 

2378 

2380 

2365 

2359 

o 2 

1297 1414 1567 

1567 

1566 

1572 

1580 

S 2 

546 

684 

721 

718 

716 

723 

726 

F 2 

758 

872 

914 

917 

918 

920 

917 

Cl 2 

427 

517 

562 

566 

564 

563 

560 

B2 2 

459 

374 

339 

340 

333 

329 

325 

h 

318 

249 

224 

224 

222 

219 

215 

Li 2 

313 

335 

345 

345 

344 

342 

351 

Na 2 

132 

145 

151 

151 

150 

149 

159 

CO 

1728 2075 2147 

2149 

2147 

2144 2170 

LiH 

1104 1296 1332 

1339 

1341 

1348 

1406 

MAE 

192 

48 

6 

6 

5 

/ 

/ 
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FIG. 3: (Color online) The PBE-D2 interaction energies of the S22 molecules obtained with in¬ 
creasing LCAO basis sets, with reference to the CCSD(T) results (the zero dash line). The results 
from PW basis set and Gaussian TZV(2df,2pd) (Ref. 
are used to guide the eye. 


461 ) basis set are shown for comparison. Lines 


various atomic basis sets and PW basis set, while the second is the coupled-cluster theory 

es [CCSD(T)]^ in the complete basis set limit. 


48L obtained using the Gaussian TZV(2df,2pd) 


with singles, doubles, and perturbative trip 
The results of Grimme as reported in Ref. 
basis set,— are also plotted in Fig. [3] for comparison. The CCSD(T) reference is indicated in 
Fig. [3] by the dash line at energy zero. It can be seen that by increasing the number of atomic 
orbitals, the results from atomic orbitals systematically approach the PW results, and on 
average get closer to the CCSD(T) references. This trend again validates the transferability 
of our atomic basis set. The quality of the Gaussian TZV(2df,2pd) is somewhere between the 
qualities of DZP and TZDP basis sets. Regarding the performance of the PBE-D2 method 
itself, it can be concluded that the ABACUS package with atomic basis set is very suitable 
for describing vdW forces. 


C. Solids 

In Sec. IIII B1 we have validated the accuracy of ABACUS with its associated atomic 
basis sets for molecular properties. Here we turn to the test of crystalline solids. This is 
a crucial check for the transferability of the atomic basis sets, because they are generated 
from diatomic systems and are now used to test solid systems. In analogy to the bond 
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lengths, atomization energies, and vibrational frequencies for molecules, here we benchmark 
the lattice constants, cohesive energies, and bulk moduli for solids. 

Twenty crystalline solids are chosen as a test set that covers group III-V and group IV 
semiconductors, alkaline and alkaline-earth metals, alkaline chloride, as well as transition 
metals. The lattice constants and bulk moduli of group III-IV and group IV semiconductors 
obtained using ABACUS have already been reported in Ref. [ 9 ]. They are included here for 
completeness. Since here we are mainly interested in the convergence behavior of the LCAO 
basis sets instead of the performance of the exchange-correlation functionals, in this test 
only the LDA is used for simplicity, except for Fe we choose PBE which yields the correct 
body-centered-cubic (bcc) ground-state crystal structure. For transition metal elements, 
TZ (3s3p3d) and QZ (4s4p4d) basis sets are used because adding the polarization functions 
(/ orbitals) does not make a noticeable difference. Note that these TZ/QZ basis sets are 
grouped together with other TZDP/QZTP basis sets for the statistical error analysis of 
energy differences in Table IVllVIIl Similar to the molecular test case, PW basis results are 
also reported in these tables as references using the same energy cutoffs and pseudopotentials 
as in LCAO calculation. The MAEs are obtained between the results from atomic basis sets 
and the PW basis set. Both simulations were carried by ABACUS. 

Monkhorst-Pack (MP) k-point meshes are used for the BZ sampling. Specifically, a 
4x4x4 k-point mesh is used for semiconductors range from GaAs to Ge. A 6 x 6 x 6 
k-point mesh is used for LiH, NaCl, and MgO. A denser 10 x 10 x 10 k-point mesh is 
used for metals include bcc Na, face-centered-cubic (fee) Al, fee Cu, and bcc Fe. Finally, a 
10 x 10 x 7 k-point mesh is used for hexagonal-close-packed (hep) Ti. For transition metals, 
the semi-core electrons are treated as valence electrons for Ti, but not for Fe and Cu. For 
the geometry optimization of Ti hep structure, the optimal c/a ratio is 1.590 by determining 
the minimum of a two dimensional (a, c) energy landscape, with a and c being the lengths 
of the lattice vectors in the hep structure. 


1. Lattice constants 


Table [V] shows the lattice constants of tested crystalline solids obtained with our hierar¬ 
chical LCAO basis s ets, in comparison with the PW results, as well as experimental data as 


collected in Refs. 
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As illustrated in Table |VI the calculated lattice constants of solids 
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converge systematically with respect to the LCAO basis set size. For the tested solids, an 
MAE of 0.02 A can be achieved with the DZP basis set, while an MAE of 0.01 A can be 
reached at the level of TZ(DP) basis set. This accuracy is comparable to that achieved for 
bond lengths of molecules, and confirms that the structural properties are well described 
with the LCAO basis sets. In addition, it can be seen that the convergence behavior of the 
basis sets for simple and transition metal elements are very similar to that for group III-V 
and group IV elements. 

2. Cohesive energies 

The cohesive energies of the tested solids at their equilibrium lattice constants are pre¬ 
sented in Table IVII Again, the general trend that the LCAO results systematically approach 
the PW limit can be observed. It is also interesting to point out that the LCAO basis sets 
show even better convergence behavior for cohesive energies of solids than atomization ener¬ 
gies of molecules. For example, at the DZP level, the MAE is only 0.04 eV for the cohesive 
energies of solids, compared to 0.09 eV for the atomization energies of molecules. The same 
behavior holds for higher levels of basis sets such as TZDP and QZTP, where the MAEs of 
0.01 eV for solids are also twice smaller than those of molecules. 

3. Bulk Moduli 

The bulk modulus is another key property of solids, reflecting the variation of the ground- 
state energy with respect to the unit cell volume around the equilibrium state. In Table I VI It 
the bulk moduli of the tested solid are presented. Compared to the PW results, the SZ basis 
set yields a large MAE of 21.2 GPa. However, this is quickly reduced to 5.8 GPa at DZ level 
and 2.1 GPa at DZP levels, this accuracy is sufficiently accurate for most purposes. For basis 
sets larger than the DZP basis set, the MAEs are further reduced to around 1.0 GPa, which 
are highly accurate. This convergence behavior of the LCAO basis sets is similar to what 
was observed for the calculated vibrational frequencies for tested molecules in Sec. IIII Bl In 
addition, similar to the cases of testing lattice constants and cohesive energies, the quality 
of the basis sets is equally good for transition metal elements as for main-group elements. 






TABLE V: Lattice constants (in A) of 20 solids obtained from various LCAO basis sets, compared to 


the PW and experimental (EX 
effects) are taken from Refs. y 


results. Experimental data (corrected for zero-point anharmonic 


and 


51 


Solid 

SZ 

DZ 

DZP TZ(DP) 

QZ(TP) PW 

EXP 

GaAs 

5.63 

5.59 

5.57 

5.55 

5.55 

5.54 

5.64 

GaP 

5.43 

5.40 

5.35 

5.33 

5.34 

5.34 

5.44 

GaN 

4.30 

4.35 

4.40 

4.41 

4.41 

4.41 

4.52 

InAs 

6.01 

5.97 

5.97 

5.96 

5.96 

5.96 

6.05 

InP 

5.84 

5.81 

5.79 

5.78 

5.78 

5.78 

5.86 

InSb 

6.46 

6.40 

6.39 

6.39 

6.39 

6.38 

6.47 

AlAs 

5.76 

5.70 

5.62 

5.61 

5.61 

5.60 

5.65 

A1P 

5.55 

5.51 

5.42 

5.41 

5.41 

5.40 

5.45 

AIN 

4.39 

4.33 

4.29 

4.27 

4.27 

4.27 

4.37 

C 

3.63 

3.55 

3.51 

3.50 

3.50 

3.50 

3.55 

Si 

5.59 

5.53 

5.41 

5.40 

5.40 

5.40 

5.42 

Ge 

5.73 

5.69 

5.64 

5.61 

5.61 

5.61 

5.64 

LiF 

4.15 

3.94 

3.90 

3.88 

3.88 

3.88 

3.97 

NaCl 

5.62 

5.54 

5.51 

5.50 

5.50 

5.50 

5.57 

MgO 

3.96 

4.02 

4.06 

4.07 

4.07 

4.07 

4.19 

Na (bcc) 

3.51 

3.94 

4.05 

4.06 

4.06 

4.08 

4.21 

A1 (fee) 

3.71 

3.81 

3.87 

3.90 

3.91 

3.93 

4.02 

Cu (fee) 

3.26 

3.51 

/ 

3.52 

3.53 

3.53 

3.60 

Fe (bcc) 

2.45 

2.58 

/ 

2.72 

2.72 

2.73 

2.86 

Ti (hep) 

2.61 

2.78 

/ 

2.80 

2.80 

2.81 

2.96 

MAE 

0.16 

0.07 

0.02 

0.01 

0.01 

/ 

/ 


D. Si(100) surface reconstruction 

In the previous tests for molecules and solids, we have established the reliability of the 
ABACUS package and the accuracy of its associated LCAO basis sets. Here we test a 
more “challenging” problem - the reconstruction of the Si(100) surface. This surface is tech- 
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TABLE VI: Cohesive energies (in eV/atom) of 20 solids at their equilibrium lattice constants 
obtained from various LCAO basis sets, c omp ared to the PW and experimental (EXP) results. 


The experimental data are taken from Ref. 
as done in Ref. [vj 


52l . corrected for the zero-tenrperature vibration effect 


Solid SZ DZ DZP TZ(DP) QZ(TP) PW EXP 


Os. As 

3.14 

3.73 

3.99 

4.01 

4.01 

4.01 

3.34 

GaP 

3.42 

4.01 

4.16 

4.18 

4.18 

4.18 

3.61 

GaN 

4.33 

5.03 

5.25 

5.27 

5.28 

5.28 

4.55 

InAs 

2.72 

3.47 

3.78 

3.82 

3.83 

3.84 

3.08 

InP 

3.15 

3.95 

4.17 

4.21 

4.21 

4.22 

3.47 

InSb 

2.47 

3.20 

3.48 

3.53 

3.55 

3.55 

2.81 

AlAs 

3.69 

4.18 

4.34 

4.57 

4.37 

4.37 

3.82 

A1P 

4.08 

4.57 

4.75 

4.77 

4.77 

4.77 

4.32 

AIN 

5.34 

6.29 

6.46 

6.48 

6.49 

6.49 

5.85 

C 

8.06 

8.53 

8.72 

8.74 

8.75 

8.75 

7.55 

Si 

4.08 

4.76 

5.20 

5.22 

5.24 

5.25 

4.68 

Ge 

3.75 

4.24 

4.55 

4.58 

4.59 

4.60 

3.92 

LiF 

4.17 

4.62 

4.88 

4.90 

4.90 

4.90 

4.46 

NaCl 

2.49 

3.20 

3.51 

3.52 

3.52 

3.56 

3.34 

MgO 

4.63 

5.46 

5.72 

5.74 

5.74 

5.77 

5.20 

Na(bcc 

) 0.98 

1.10 

1.25 

1.27 

1.27 

1.27 

1.12 

Al(fcc) 

3.18 

3.79 

3.95 

3.96 

3.96 

3.97 

3.43 

Cu(fcc 

) 3.65 

4.17 

/ 

4.46 

4.48 

4.50 

3.52 

Fe(bcb 

) 5.54 

5.18 

/ 

5.10 

5.08 

5.08 

4.30 

Ti(hcp) 4.82 

5.24 

/ 

5.33 

5.35 

5.36 

4.88 

MAE 

0.85 

0.26 

0.04 

0.01 

0.01 

/ 

/ 


nologically important for fabricating silicon-based devices and has been intensively studied 
both theoretically and experimentally. Different reconstruction models for the Si(100) sur¬ 
face have been proposed in the past,—“— and the energy hierarchy among these different 
reconstructions have been examined by both DFT (LDA) calculations^ and by quantum 
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TABLE VII: Bulk Moduli (in GPa) of 20 solids at their equilibrium lattice constants obtained from 


various LCAO 
data are taken 


basis sets, compared to the PW and experimental (EXP) results. The experimental 
from Refs. Q except for Fe and Ti [Ref. 53]. 


Solid 

SZ 

DZ 

DZP TZ(DP) QZ(TP) 

PW 

EXP 

Get As 

66.5 

68.3 

75.1 

76.8 

76.8 

77.2 

76 

GaP 

80.7 

82.3 

89.8 

94.1 

93.7 

92.4 

89 

GaN 

187.5 

203.8 

209.6 

207.3 

207.9 

208.2 

210 

InAs 

59.5 

65.8 

67.2 

67.3 

67.2 

67.9 

60 

InP 

74.3 

75.8 

77.2 

79.5 

80.3 

81.4 

71 

InSb 

45.0 

48.6 

49.9 

49.9 

50.2 

51.4 

47 

AlAs 

62.9 

71.4 

74.2 

74.6 

74.6 

75.3 

77 

A1P 

70.1 

78.5 

86.2 

87.4 

87.7 

88.3 

86 

AIN 

192.9 

202.7 

205.3 

208.4 

207.8 

207.2 

202 

C 

428.7 

442.8 

459.3 

459.4 

458.5 

458.2 

443 

Si 

72.3 

78.2 

93.9 

93.9 

93.5 

93.3 

99 

Ge 

53.4 

67.3 

71.1 

72.0 

72.8 

73.5 

76 

LiF 

44.1 

58.5 

60.2 

61.1 

61.3 

62.6 

70 

NaCl 

14.7 

22.1 

29.4 

29.6 

29.8 

30.1 

27 

MgO 

128.6 

165.4 

167.7 

171.9 

171.3 

170.2 

165 

Na(bcc) 

5.9 

6.8 

7.0 

7.0 

7.0 

7.1 

8 

Al(fcc) 

70.2 

74.9 

75.8 

76.0 

76.0 

76.2 

79 

Cu(fcc) 120.8 

147.3 

/ 

149.4 

150.2 

150.6 

142 

Fe(bcc) 

96.8 

160.4 

/ 

161.1 

161.8 

165.0 

170 

Ti(hcp) 

52.8 

113.3 

/ 

122.4 

122.9 

124.5 

110 

MAE 

21.2 

5.8 

2.1 

1.2 

0.8 

/ 

/ 


Monte Carlo^ calculations. The small magnitude of energy differences between these recon¬ 
structions offers an excellent testing ground for ABACUS with the LCAO basis sets. 

Three reconstructions of the Si(100) surface are considered in the work, namely the 
p(2 x 1) symmetric [denoted as p(2 x l)s below], p(2 x 1) asymmetric [p(2 x l)a], and 
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(a) (b) 



FIG. 4: (Color online) Side view of three reconstruction structures for the Si(100) surface: (a) 
p(2xl)s; (b)p(2xl)a; (c)p(2x2); (d) zoom-in of the top three layers of the p(2x l)a reconstruction 
structure. 

p(2 x 2) reconstructions, as shown in Fig. 0](a-c). It is well known that the Si atoms on 
the surface layer form dimers to lower the energy of the system by removing one of the 
two dangling bonds.— In the p(2 x l)s reconstruction structure, the two atoms on the top 
layer come closer symmetrically, with bond length between them becomes slightly shorter 
than the nearest neighbor distance in the Si bulk. In the p(2 x l)a case, the dimers buckle 
out of the Si(100) surface. Finally, in the p(2 x 2) case, the buckled dimers change their 
orientations alternatively (see Fig. 01(c)). Note that another c(2 x 4) reconstruction exists 
where the adjacent buckled dimer rows orientate oppositely. However, the energy lowering 
of this reconstruction is almost identical to that of p(2 x 2) when using DFT with LDA,— 
thus this fourth reconstruction structure is not considered in this work. 

Next we describe the computational setup of our simulations. ABACUS is used with 
both LCAO DZP basis set and PW basis set. Also, the Quantum ESPRESSO (QE) 
package^ is used, which serves as an independent check for the ABACUS package. The 
same pseudopotential for Si is used for all three simulations. We model the Si(100) surface 
by using a repeated slab which contains 12 atomic layers, because the structural distortion 
below the surface layer extends 4-5 layers into the bulk as noticed before.— The central two 
layers are fixed during the structural relaxation, and only atoms in the outermost five layers 
on each side are allowed to relax. The conjugate gradient algorithm is used for structural 
relaxation with a force threshold of 0.01 eV/A. The slabs are separated by a vacuum of 
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30 A thick to avoid artificial interactions between neighboring slabs. When computing the 
energy differences of the p(2 x l)s and p(2 x l)a reconstructions with respect to the ideal 
surface, the (2x1) unit cell in the x-y plane is used, while in the case of the p(2 x 2) recon¬ 
struction, the (2x2) unit cell is used for both relaxed and ideal surfaces. In all calculations, 
a (6 x 6 x 1) k-point mesh is used. 

Table lYUTl reports the energy lowings of the p(2 x l)s, p(2 x l)a, and p(2 x 2) recon¬ 
structions with respect to the ideal Si(100) surface. For the ideal Si(100) surface, we fix the 
surface structure using the bond lengths in Si bulk. Table lYUTl shows an excellent agreement 
between the ABACUS/PW results and the QE results. First, from QE calculations, the 
successive energy decreases of the three relaxation steps, i.e., from the ideal surface to the 
p(2 x l)s, p(2 x l)a, and p(2 x 2) reconstructed surfaces, are 1.504 eV, 0.086 eV, and 0.063 
eV per Si dimer, respectively. These numbers can be compared to the 1.80 eV, 0.12 eV, and 


0.05 eV ones as reported in Ref. |58|. The differences between results from QE and Ref. 


58 


are presumably due to the usage of different pseudopotentials, energy cutoff and BZ k-point 
sampling. However, both methods lead to the same energy orderings of the three reconstruc¬ 
tions with respect to the ideal Si(100) surface. Also it is assuring that ABACUS with the 
PW basis set gives almost identical results compared to QE calculations as can be seen in 
Table IVTTn Finally, ABACUS with DZP basis set yields 1.481 eV, 0.078 eV, 0.070 eV per 
Si dimer energy lowerings, which are in excellent agreement with the other two PW results. 


TABLE VIII: The energy lowerings per dimer (in eV) of three different reconstructions of the 
Si(100) surface, with respect to the ideal Si(100) surface. 


Surface ABACUS/DZP ABACUS/PW Quantum ESPRESSO 


p(2 x 

l)s 

-1.481 

-1.505 

-1.504 

p(2 x 

l)a 

-1.559 

-1.591 

-1.590 

T3 

to 

X 

2) 

-1.629 

-1.652 

-1.653 


Next we examine the relaxed structures obtained by the three methods. We choose the 
p(2 x 1) reconstruction because its structure distortion is most pronounced among the three. 
Specifically, we look at the pentagon pattern formed by the atoms from the top three layers 
of the p(2 x 1) structure, as illustrated in Fig. |4](d). In Table IDO we listed the bond lengths 
and bond angles (the five edge lengths and angles) of the pentagon as yielded by the three 
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types of calculations. Again, an almost perfect agreement is obtained between the results 
obtained by the ABACUS/PW calculations and the QE calculation. In this case, the bond 
lengths from the two methods agree within 0.001 A, while the bond angles agree within 0.1 
degree (Deg). The results from the DZP basis set are also in excellent agreement with the 
PW results. The bond lengths from former calculation are slightly longer than the latter 
ones by 0.003-0.007 A, and the bond angles differ by 0.4 Deg at most. In conclusion, for all 
the tested properties of the three reconstructions of Si(100) surface, the ABACUS package 
with the DZP basis set gives accurate results compared to PW results, demonstrating again 
the ability of ABACUS to do reliable surface calculations. 

TABLE IX: The bond lengths between the atoms in the top three layers of the p(2 x l)a recon¬ 
struction. aij is the bond length (in A) between the i-th and j’-th atoms as shown in Fig. IU(d). 0i 
is the bond angle (in Deg.) formed by the two bonds sharing the i-th atom. 

Parameter ABACUS/DZP ABACUS/PW Quantum ESPRESSO 


ai2 

2.366 

2.360 

2.360 

&23 

2.337 

2.334 

2.335 

a34 

2.405 

2.401 

2.401 

&45 

2.307 

2.303 

2.302 

&51 

2.290 

2.283 

2.283 

0i 

90.13 

89.88 

89.94 

0 2 

104.55 

104.24 

104.21 

0 3 

100.19 

99.51 

100.49 

04 

81.34 

80.42 

80.48 

05 

121.80 

122.15 

122.09 


E. N defect in bulk GaAs 

Real materials contain various types of defects, and their presence greatly affects, and 
often decisively determines the physical properties of materials. In recent years, DFT-based 
first-principles approaches have emerged as powerful tools for describing and understanding 
of point defects in solids,— and is becoming an indispensable complement to experiments 
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FIG. 5: Calculated band gap of GaAsi-^Nj, as a function of the N concentration x. The solid 
square on the left corresponds to the band gap of GaAs bulk. The number of atoms used in 
a supercell is labeled for each blank black point. The lines connecting the data points are for 
guidance. 

that are often difficult and expensive to carry out. Since usually a large supercell is required 
to model defects in solids, the LCAO technique, which scales favorably with the system size, 
is a preferable choice for simulating the electronic structures of defects. 

As a specific example, we employ the ABACUS package to study the group III-V 
semiconductor alloy GaAsi-^Na,, where x is the concentration of the nitrogen impurity. Both 
GaAs and GaN are technologically important materials in semiconductor industry, hence 
there has been considerable interest in alloying GaAs and GaN to obtain optoelectronic 
properties that bridge nitrides and arsenides. The simulations are done again with the 
supercell approach with successively increasing cell size, containing 16, 32, 64, 128, 256, 
512, and finally 1024 atoms. These correspond to approximate x values of 0.125, 0.063, 
0.031, 0.016, 0.008, 0.004, and 0.002 respectively. Only T-point is used in the simulations of 
supercells with 512 atoms and more. For smaller supercells, finite k-point meshes are used. 
Specifically, they are 1x1x2, 1x2x2, 2x2x2, 2x2x4, and 2x4x4 k-point meshes 
for system sizes ranging from 256 atoms to 16 atoms, respectively. The LDA is chosen to 
be exchange-correlation functional. The internal geometry of each supercell is fully relaxed. 
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1. Band gap of GaAs\- x N x 

The photoluminescence edge of GaAsi-^N^ for small x shows an unexpected redshift, 
instead of a bluesliift— as inferred from the linear interpolation between the two endpoints 
(1.4 eV for GaAs and 3.8 eV for GaN). The narrowing of the alloy band gap E g (x) from the 
composition-weighted linear average value E g (x) = xE s ( 0) + (1 — x)E g (l) is called the band- 
gap bowing, which is a general feature of semiconductor alloys and can often be described as 
5E g (x) = bx(x — 1) with b being the optical bowing parameter.— In GaAsi-^Nj,, the bowing 
effect is extremely pronounced with a large b coefficient of about 16 eV, in stark contrast 
with most isovalent semiconductor alloys that have a b value of only a fraction of eV. This 
not only leads to a pronounced band-gap narrowing for small x (x < 0.015), but also even 
a closing of the band gap for large x values. 

Such a peculiar behavior has been analyzed in details by several theoretical studies^ - — 
based on first-principle calculations, and the strong bowing effect has been attributed to 
the substantial lattice mismatch (> 20%) between GaAs and GaN—and the formation of 
spatially separated and sharply localized band-edge states.- ‘ However, due to the limitation 
of the computational resources, the previous calculations of GaAsi_ x N x focused only on the 
alloy regime with x =0.25, 0.50 and 0.75,—or were based on empirical pseudopotentials.— 

The ABACUS package with LCAO basis sets allows us to reach large systems with x 
as low as 0.002. The calculated band gaps of GaAsi_ x N x systems are shown in Fig. [5j The 
blank square points from right to left are the band gaps obtained from supercell calculations 
using 16, 32, 64, 128, 256, 512, and 1024 atoms, respectively. Note that the x = 0 limit 
represents the calculated band gap of bulk GaAs, which is 1.13 eV from LDA. Although LDA 
based calculation in general underestimates the band gap of materials, it reproduces very 
well the experimental finding that the band gap of GaAsi_ x N x gets continuously reduced as 
x increases. We note that, the geometries of the supercells in this work are chosen in a special 
way, i.e., by doubling the superccll size successively along the x, y, and z directions. Thus 
the exact behavior of the plot in Fig. [5] might slightly changes if the shape of the supercells 
is chosen differently. However, this should not alter the general trend we obtained. We also 
find that the closing of the band gap happens when x is larger than 0.125. 


36 



io " 3 io " 2 io " 1 


N concentration * 

FIG. 6: (Color online) The N defect formation energy of GaAsi-zNj, as a function of the N 
concentration x. Two LCAO basis set setups are used in the ABACUS calculations: DZP for all 
elements (black circles); DZP for Ga and As, and TZDP for N (red diamonds). Results obtained 
from VASP (blue squares) and ABACUS/PW basis (violet triangles) calculations are shown for 
comparisons. The energy cutoff is set to 500 eV in VASP calculations. 

2. Formation energy of GaAs\- x N x 

The next relevant issue is the stability of the GaAsi-^N^ alloy. A key quantity here, for 
small x, is the formation energy of an N defect, which is defined as 

Ef = E( Ga n As n _iN) — E(G a n As n ) + //(As) — //(N), (31) 

where E( Ga n As n ) is the energy of a Ga n As n superccll containing n GaAs formula units, 
i?(Ga n As ra _iN) is the energy of the above supercell but with one As atom replaced by an 
N impurity atom. //(As) and //(N) are the chemical potentials for the As atom and the 
N atom, respectively. The actual value of the atomic chemical potential depends on the 
chemical conditions in the experiment. 

Here we consider two situations: the As-rich & N-rich condition and the As-poor & N-rich 
condition. In the first one, //(As) = // rich (As) is chosen to be the energy per atom in the 
yellow arsenic crystal form. In the second case, // poor (As) is given by adding the formation 
enthalpy of bulk GaAs to // nch (As). The formation enthalpy of bulk GaAs bulk is taken here 
as the atomization energy of GaAs per atom at zero temperature (2.00 eV). Finally, /z nch (N) 
is chosen to be one half of the total energy of the N 2 molecule. Obviously, the corresponding 
N concentration in the supercell Ga n As n _iN is given by x = 1/n. We would like to point 
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out that, by using Eq. (j3Tj) . the N impurities are regarded as point defects, and the possible 
interactions between neighboring N defects are neglected. 

In Fig. O the formation energy of an N defect is plotted as a function of the N concentra¬ 
tion x for both As-rich (upper panel) and As-poor (lower panel) conditions. Calculations are 
done using the ABACUS package with two sets of LCAO bases: 1) DZP for all elements; 
2) DZP for Ga and As, and TZDP for N. The results are compared to those obtained from 
the ABACUS/PW calculations, and those obtained by the Vienna Ab-initio Simulation 
Package (VASP) package with projected augmented wave method.— 1 ^ Fig. [6] shows that 
under the As-rich condition, the N defect formation energies of GaAsi-^Nj. are positive for 
all N compositions, indicating the very low probability to dope N into GaAs under such con¬ 
dition. ffowever, under the As-poor condition, the formation energies become negative for 
small N concentrations, which is consistent with the experimental finding that GaAsi-zhU 
alloy can only be formed for narrow composition range near the endpoints.— 

The formation energy curves from ABACUS/DZP calculations follow the same trend 
as the ones from ABACUS/PW calculations for both As-rich and As-poor cases in Fig. [6l 
ffowever, there are noticeable differences of calculated formation energies between the two 
methods. Specifically, the formation energies associated with the DZP basis set are under¬ 
estimated by 0.2 to 0.3 eV compared to the ones from PW calculations. This is due to the 
fact that /r( N ) is calculated by half of the N 2 energy (cf. Eq. [3T]) , and the N 2 total energy 
calculated with the DZP basis still has an appreciable difference from the PW reference 
result, as shown in Fig [2] In fact, by just increasing the N basis set from DZP to TZDP, the 
corresponding formation energy difference is largely improved to within 0.1 eV compared 
to PW results for all cases, as shown in Fig. [6] Finally, comparing the ABACUS/PW 
results and the VASP results reveals that there exists an appreciable difference between the 
norm-conserving psedudopotenital treatment and the projector augmented wave method. 
Despite these differences, the general trend of the dependence of the formation energy on 
the N concentration is well reproduced within all calculations. 

In conclusion, the study of GaAsi-zN/; alloy illustrates that the ABACUS package with 
its LCAO basis sets can be used for reliable defect calculations. We would also like to note 
that, for a thorough and faithful treatment of the phase stability problem of the GaAsi_ 3 ,N x 
alloy, one needs to consider the influence of finite temperatures and include the entropy 
effect, but this goes beyond the scope of the present work. 
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IV. SUMMARY 


To summarize, in this paper we introduce a comprehensive first-principles package, named 
ABACUS , in which both plane waves and efficient localized atomic orbitals can be used for 
electronic-structure calculations. In particular, we present the mathematical foundation and 
numerical techniques behind the atomic-orbital-based implementation within this package. 
The performance and reliability of the ABACUS package were benchmarked for a variety 
of systems containing molecules, solids, surfaces and defects. Furthermore, we show that 
the hierarchial atomic basis sets generated with the CGH scheme allows for a systematic 
convergence towards the plane-wave accuracy, and the DZP basis set offers an excellent 
compromise between accuracy and the computational load, and can be used in production 
calculations for most purposes. The package is currently under active development, with 
more features and functionalities are being implemented. With all these efforts, we expect 
the ABACUS package will become a powerful and reliable tool for simulations of large-scale 
materials. 
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Appendix A: Appendix 


1. Two-center integrals 


The overlap matrix and the kinetic energy matrix can be efficiently calculated by two- 
center integral technique,— which has been described with full details in Ref. (J Here we 
briefly introduce this algorithm. The overlap matrix is written as 


S( R) = / C(r)0 v (r-R)dr, 


(Al) 


which can be further written as,- 

2 Imax l 


S < R ) = EE s t 




IhJR). 


(A2) 

1=0 m=—l 

Here (l v ) and (m u )are angular momentum and magnetic quantum numbers for orbital 
/i {v). The radial part is 


Sipm ll ,i v m v ,im{R) — 4vri 1 I ji(kR) f^k) f u (k)k 2 dk . 


(A3) 


Here f^(k) and f v {k) are one dimensional Fourier transform of the radial atomic functions 
introduced in Eq. [151 


uw = f r2 jiA kr )f^ r ) dr ■ 


(A4) 


Si li m tll i v m Vl im(R) can be tabulated with dense sampling of distances between and <f> v . 


Cl 




in Eq. IA2I is called the Gaunt coefficient, 


r-2 -it 


/ sin (9)d0 Y, m (9, 0)^(0, <j>)Y lm {0,<l>)d4, 


(AS) 


JO Jo 

which can be calculated and tabulated recursively from the Clebsch-Gordan coefficients.— 
For convenience, the real spherical harmonic functions are actually used in ABACUS. The 
overlap matrix formed by an atomic orbital |0 M ) and a non-local projector \Xaimn) [See Eq. 
can be calculated exactly in the same way. 

The kinetic energy operator matrix element, 


UR)= f 0;(r)(-iv 2 )0„(r-R)* 


(A6) 


41 



is slightly different. This term can be calculated in a similar way as S'(R) by replacing 
With 

/*oo 

Ti^Mmi-R) = 2vrr z / Mktyf^Mk^dk. (A7) 

Jo 

Because and T i)irnii ^ mvM (R) are all independent of the coor¬ 

dinates of atoms, they can be tabulated at the beginning of the DFT calculations once for 
all. For any given distance between two atoms, the corresponding overlap matrix elements 
and kinetic energy matrix elements can be calculated efficiently by interpolation method. 


2. Grid-based techniques 


The Hamiltonian matrix elements V^° c are evaluated on a uniform real space grid, with 
both atomic orbitals and local potentials presented on each grid point. The local potential 
is, 

V loc (r) = V L (r) + V H (r) + V xc {r), (A8) 


where V L (r) = E R Em^( r “ - R) is the summation of all the local pseudopotentials 

for z-th atom of element type a. Plane wave basis and Fast Fourier Transform (FFT) 
techniques are used here to efficiently evaluate V L {r) and V H {r) on the grid. Because the 
local pseudopotential v^(r) has a fairly long tail in real space, it is inefficient to calculate 
V L {r) directly on a real space grid. Therefore, V L (G) is first calculated in reciprocal space 
as 


V L {G) = Y 1 S a (G)v^G), 

Q 


(A9) 


where S a ( G) = Y2i e ~ lG ' Tai is the structure factor. An FFT is carried out to bring U L (G) 
back to real space. From our tests, this construction processes of V L {r) only take a small 

br systems containing thousands of atoms. This 
, where the short-ranged neutral atom potentials 
are used. Using the same set of plane wave basis, the Hartree potential is also first evaluated 
in reciprocal space and then be brought back to real space by using an FFT. The full formula 
of Hartree potential is 


portion ol total computational time, even 
is different from the method used in Ref. 6 


V h (t)=4tY, 

G^O 


P(G) jG-r 

IGI 2 


(A10) 
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3. Force calculations 


As shown in the main text, the forces evaluated from the basis of atomic orbitals have 
four contributions, 


F 


pFH _|_ ppulay _|_ portho _|_ pEwald 


(All) 


which are the Feynman-Hellmann force, the Pulay force, the force due to nonorthogonality of 
the atomic orbitals, and the Ewald force due to the Coulomb interactions between the ions. 
The Ewald force can be calculated analytically using the Ewlad summation techniques,— 
and therefore not discussed here. We discuss here the techniques to evaluate the rest three 
terms as follows. 

1. Feynman-Hellmann force, 


(JT r, 

R [iv 


(A12) 


In Hamiltonian H , only the local and non-local pseudopotentials explicitly depend on 
the coordinates of ions. Thus we can further break F™ into two terms related to pseu¬ 
dopotentials. The first one is related to the local pseudopotential, and can be evaluated in 
reciprocal space, 

iGe~ lG ' Tai V L {G)p*{G). (A13) 

Gyo 

This method has been shown to be accurate and fast.— The second term involves the non¬ 
local pseudopotential, 


c=-EEw?-i« 

T-* ^ rV 

R [iv 


R [iv Imn 


dXailmn \ / \ ± \ > / i i \ / ^Xailmn i , \ 

/ \Xcx.ilmn \ fivo) “1“ \0/iR| XmZmri/ \ 10^0/ 


cIt a 


dr n 


(A14) 


A non-local pseudopotential projector Xaiimn is also a one dimensional numerical orbital like 
atomic orbital. Therefore, the above equation can be evaluated efficiently using two-center 
integral technique introduced before. 

2. Pulay force 

The existence of Pulay force is due to the fact that the basis set is not complete, 


rppulay _ 
r ai 


EE 

R p,u 


R 

dr ni 


\H\<M + ('WflltrA ■ 


(A 15) 
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The kinetic energy term and the non-local pseudopotential term in F^ lay can also be cal¬ 
culated by the two center integral technique, whereas the local potential term is evaluated 
by the grid integral technique. These steps are similar to the calculations of Hamiltonian 
matrix, except one needs to substitute and <p v with and To evaluate the deriva- 
tives of an atomic orbital with respect to the coordinates of the atoms, one needs to deal 
with both radial atomic orbital and spherical harmonic function parts. The radial part can 
be calculated numerically on a one dimensional grid while the second part can be obtained 
analytically by using the real spherical harmonic functions. 

3. The force arises from the fact that atomic orbitals are not orthogonal, 

*5-=-EE e^) 9 Y K) > ( A16 ) 

Ol~ai 

R fiu 


where 


^ ' /nk£nkC Mn ,k c ru/,ke 


—zk-R 


(A17) 


nk 


is the element of “energy density matrix” .— The derivative of overlap matrix with respect to 
atomic coordinates is, 

riif ris riii ds iri 

(A18) 


SS^R) _ <9S^(R) _ dS^(R) 


dr v 


dr u 


dD 


where and r v are the atomic coordinates for orbital /i and u, and D = R + r v — r M , is the 
distance between two orbitals. The further expansion of this term is 


dS^R) 

dD 


£ 

lm 


d 

dD 


D~ l S> 




(D) 




d 


^ ^ D Sl^m^,l u m u ,lm{D)Gi^ rn ^^i urnu ^i rn ^m(D)-D 

lm 


(A 19) 


In order to make YJ m (D) analytical at the origin, here it is multiplied by D l . 


d 

dD 

d 

dD 


D~ l S) 




can be calculated numerically using interpolation method while 


Y lm (D)D 


can be calculated analytically by using real spherical harmonic functions. 
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